See, Hear, Explore: Curiosity via Audio-Visual Association
Exploration is one of the core challenges in reinforcement learning. A common formulation of curiosity-driven exploration uses the difference between the real future and the future predicted by a learned model. However, predicting the future is an inherently difficult task which can be ill-posed in the face of stochasticity. In this paper, we introduce an alternative form of curiosity that rewards novel associations between different senses. Our approach exploits multiple modalities to provide a stronger signal for more efficient exploration. Our method is inspired by the fact that, for humans, both sight and sound play a critical role in exploration.
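The prediction-error formulation of curiosity mentioned above can be sketched concretely: the intrinsic reward is the error between a learned forward model's prediction of the next state and the state actually observed. A minimal sketch follows, with a random linear map standing in for the learned model; all names and dimensions here are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a learned forward model: maps (state features, action)
# to predicted next-state features. A real agent would train this online.
W = rng.normal(scale=0.1, size=(8 + 2, 8))

def predict_next(features, action):
    """Forward-model prediction of the next state's features."""
    return np.concatenate([features, action]) @ W

def prediction_error_reward(features, action, next_features):
    """Curiosity bonus: squared error between predicted and real future.
    High in unfamiliar states, but also high under irreducible stochasticity,
    which is the failure mode the abstract's ill-posedness remark refers to."""
    return float(np.sum((predict_next(features, action) - next_features) ** 2))

s = rng.normal(size=8)       # current state features (toy)
a = rng.normal(size=2)       # action (toy)
s_next = rng.normal(size=8)  # observed next-state features (toy)
r_int = prediction_error_reward(s, a, s_next)
```

Because the bonus never distinguishes reducible from irreducible error, pure noise sources keep it permanently high, which motivates the association-based alternative.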
Review for NeurIPS paper: See, Hear, Explore: Curiosity via Audio-Visual Association
Weaknesses: My biggest concern with this paper is the treatment of error as reward, or as this paper refers to it, "curiosity by self-supervised prediction." The "couch-potato" issues associated with using error as reward (described in lines 117-121) have been known for decades (e.g., Schmidhuber, 1991, towards the end of Section 3) yet we seem to have to keep re-discovering them. Can you address why it makes sense to use error as reward in your setting despite this problem? It seems particularly concerning since a stated "longer-term goal is to deploy multimodal curiosity on physical robots," a setting with inherent stochasticity. Could you please provide some reasons why you believe that "discovering new sight and sound associations" (lines 122-123) could mitigate the couch-potato problem?
To compute audio features, we take an audio clip spanning 4 time steps (1/15th of a second for these 60-frames-per-second environments) and apply a Fast Fourier Transform (FFT). The FFT output is downsampled using max pooling to a 512-dimensional feature vector, which is used as input to the discriminator along with a 512-dimensional visual feature vector.
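The audio-feature pipeline above can be sketched as follows. The 512-dimensional output and the FFT-then-max-pool structure follow the text; the sample rate, the use of the magnitude spectrum, and the exact pooling-window arithmetic are illustrative assumptions.

```python
import numpy as np

def audio_features(clip, out_dim=512):
    """FFT magnitude spectrum of a short audio clip, max-pooled down to
    a fixed-size feature vector (out_dim follows the text's 512 dims)."""
    spectrum = np.abs(np.fft.rfft(clip))           # magnitude of FFT bins
    # Zero-pad so the spectrum splits evenly into out_dim pooling windows
    # (padding is harmless for max pooling since magnitudes are >= 0).
    window = int(np.ceil(len(spectrum) / out_dim))
    padded = np.pad(spectrum, (0, window * out_dim - len(spectrum)))
    return padded.reshape(out_dim, window).max(axis=1)   # max pooling

# 4 time steps at 60 FPS = 1/15 s of audio; 44.1 kHz is an assumed rate.
sample_rate = 44100
clip = np.random.default_rng(1).normal(size=sample_rate * 4 // 60)
feat = audio_features(clip)
```

The resulting 512-dimensional vector would then be concatenated with the 512-dimensional visual features as the discriminator's input.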